ABSTRACT

This paper discusses Multi-Feature Hierarchical Sequential 
PAttern Mining, MFH-SPAM, a novel algorithm that ef- 
ficiently extracts patterns from students’ learning activity 
sequences. This algorithm extends an existing sequential 
pattern mining algorithm by dynamically selecting the level 
of specificity for hierarchically-defined features individually 
for each pattern. Consequently, MFH-SPAM operates on a 
larger space of patterns in the activity sequences. In this 
paper, we employ a differential version of MFH-SPAM to 
extract a small set of patterns that best differentiate stu- 
dents with different learning behavior profiles in the Betty’s 
Brain system. Our results illustrate that: (1) MFH-SPAM 
identifies important patterns missed by traditional sequence 
mining approaches; and (2) the differential patterns provide 
additional information for characterizing learning behaviors. 
This has implications for developing targeted and adaptive 
scaffolding in open-ended learning environments. 

John S. Kinnebrew
Department of EECS and ISIS
Vanderbilt University
1025 16th Ave S, Ste 102
Nashville, TN 37212
john.s.kinnebrew@vanderbilt.edu

Gautam Biswas
Department of EECS and ISIS
Vanderbilt University
1025 16th Ave S, Ste 102
Nashville, TN 37212
gautam.biswas@vanderbilt.edu

1. INTRODUCTION

Open-Ended Learning Environments (OELEs [4,7]) present
students with a challenging problem-solving task, along with
resources and tools for solving the task. Students have the
choice to explore, and, therefore, can evolve their solutions in
a variety of ways. In previous work, we proposed a theory-based
approach called coherence analysis (CA) [7] for analyzing
student behavior in OELEs. Experimental results
showed that grouping students using the CA metrics produced
distinct behavior profiles that are discussed in greater
detail in Sections 3 and 4. To date we have established the
stability and usefulness of our CA measures across extended
periods of student work, which does not make this approach
directly applicable to adaptive scaffolding as students work 
in the OELE. To address this problem, our goal has been 
to use sequence mining methods to find students’ activity 
patterns that are indicators of their behavior profiles. In 
this paper, we present a case study illustrating that action 
patterns derived using a novel hierarchical sequence mining 
approach followed by differential analysis enable classifica- 
tion performance on a par with the groupings derived using 
CA. Occurrence of individual action patterns can be easily 
detected online, and future work will assess their utility for 
early identification of behavior profiles and contextualized 
scaffolding in OELEs. 

In the Betty’s Brain OELE [5] each action performed by a 
student has a number of accompanying features that capture 
context and consequences of the action. In past work, we 
used pre-processing methods to select specific features and 
the level of granularity for each feature to generate ‘flat’ se- 
quences for pattern mining [2]. This largely ad hoc process 
resulted in our running many different mining analyses, but 
often missing potentially important patterns. Other work, 
such as Plantevit et al. [6], has addressed some aspects of 
the search in large feature spaces. They define a two-phase 
technique that first determines frequent combinations of fea- 
tures and levels of specificity in hierarchical representations 
to pre-process multi-feature (hierarchical) sequences into
a ‘flattened’ representation. While this approach provides 
clear advantages over numerous mining analyses with ad 
hoc feature and granularity choices, many frequent patterns 
can still be missed due to the initial flattening phase. To ad- 
dress this issue, we have developed a novel Multi-Feature, 
Hierarchical Sequential PAttern Mining algorithm (MFH- 
SPAM). 

MFH-SPAM extends the sequence mining algorithm SPAM [1] 
to simultaneously operate on the entire feature space of ac- 
tion sequences for pattern mining. In this work, we start 
with MFH-SPAM, and then apply a classifier wrapper method 
[3] to discover a small subset of mined patterns that are useful
for differentiating students across the CA-derived learning
behavior profiles. We have evaluated MFH-SPAM and
other traditional sequence mining approaches in this behav- 
ior profile classification task using data from a recent study 
with the Betty’s Brain OELE. Results show that MFH- 
SPAM consistently outperforms traditional sequence min- 
ing approaches on this task. Further, the differential pat- 
terns provide additional information for characterizing stu- 
dent learning behaviors, which has implications for develop- 
ing targeted and adaptive scaffolding in OELEs. 

2. MFH-SPAM APPROACH 

Our approach to efficient mining of Multi-Feature, Hierar- 
chical (MFH) sequences extends the SPAM algorithm [1] 
by directly working with the MFH representation of actions 
during the mining process. To illustrate this representation, 
we consider a generic set of possible items/actions that make
up sequences (A, B, or C) with an additional feature (e.g., a
measure of the action’s outcome) that can take on values of
+ or − at the most general level. In this example, + values
for the outcome feature can be further specified as either
+Big or +Small at the next level of the hierarchy. Therefore,
an individual action might be represented as B^{+Big},
and both B^{+Big} and B^{+Small} actions could be more generally
represented as B^{+} by abstracting the outcome feature
to the more general + level. Further, B^{+Big}, B^{+Small}, and
B^{−} actions could all be represented as simply a B action by
ignoring the outcome feature entirely. We represent one action
followed by another in a sequential pattern using the
→ symbol, such as A → B to indicate A followed by B.
Itemsets (i.e., co-occurring items in the sequence) are surrounded
with parentheses, such as (A, B) to indicate both
A and B occurring at the same position in a sequence (i.e.,
simultaneously).
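The levels of specificity described above can be illustrated with a minimal sketch. The tuple encoding and the helper names (`generalizations`, `is_generalization`, `show`) are assumptions made for illustration only; they are not taken from the MFH-SPAM implementation.

```python
from typing import Iterator, Tuple

# Hypothetical encoding: an action is (item, path), where path lists the
# outcome feature's value from the most general to the most specific level.
Action = Tuple[str, Tuple[str, ...]]


def generalizations(action: Action) -> Iterator[Action]:
    """Yield the action at every level of specificity, most specific first."""
    item, path = action
    for depth in range(len(path), -1, -1):
        yield (item, path[:depth])


def is_generalization(general: Action, specific: Action) -> bool:
    """True if `general` describes `specific` at an equal or coarser level."""
    same_item = general[0] == specific[0]
    return same_item and specific[1][:len(general[1])] == general[1]


def show(action: Action) -> str:
    """Render an action in the caret notation used in the text."""
    item, path = action
    return item if not path else f"{item}^{{{path[-1]}}}"


b_big = ("B", ("+", "+Big"))
print([show(a) for a in generalizations(b_big)])
# ['B^{+Big}', 'B^{+}', 'B']
```

Under this encoding, B^{+} generalizes both B^{+Big} and B^{+Small}, while B^{+Big} does not generalize B^{+Small}.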

The core SPAM [1] algorithm searches the space of possi- 
ble sequential patterns by incrementally extending the cur- 
rent pattern (starting with an empty pattern) in a depth- 
first manner. For each pattern in the search, SPAM gener- 
ates the potential “child” patterns by applying one of two 
types of extensions to the current pattern: 1) a Sequence- 
extension step (S-step), which appends an item to the end 
of the sequence (occurring after the last item/itemset), or 2) 
an Itemset-extension step (I-step), which adds an additional 
item to the last itemset in the current pattern. For each pat- 
tern considered, SPAM calculates the number of sequences 
in which the pattern occurs using a vertical bitmap repre- 
sentation, explained in more detail later. If the number of 
sequences in which the new pattern is contained is less than 
the specified support threshold, SPAM rejects the pattern 
and does not consider any subsequent extensions to it. 
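The depth-first search with support-based pruning can be sketched as follows. This is a simplified illustration, not SPAM itself: it uses S-steps only (no I-steps), assumes the no-gap case, counts support by direct matching rather than with bitmaps, and adds a hypothetical `max_len` bound to keep the example small.

```python
def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a contiguous run."""
    n = len(pattern)
    hits = sum(
        any(tuple(seq[i:i + n]) == pattern for i in range(len(seq) - n + 1))
        for seq in sequences
    )
    return hits / len(sequences)


def mine(sequences, items, min_support, max_len=3):
    """Depth-first pattern search that prunes infrequent patterns."""
    frequent = {}

    def extend(pattern):
        for item in items:
            child = pattern + (item,)  # S-step: append one item at the end
            s = support(child, sequences)
            if s < min_support:
                continue  # rejected: none of its extensions are explored
            frequent[child] = s
            if len(child) < max_len:
                extend(child)

    extend(())  # start from the empty pattern
    return frequent


seqs = [list("ABB"), list("ABC"), list("CAB")]
result = mine(seqs, "ABC", min_support=2 / 3)
print(sorted(result))
# [('A',), ('A', 'B'), ('B',), ('C',)]
```

Because A → B → B and A → B → C each occur in only one of the three toy sequences, the search stops at A → B on that branch.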

MFH-SPAM augments SPAM with two new pattern exten- 
sion steps in the pattern search: Feature extensions (F- 
steps) and Hierarchical extensions (H-steps). During an F- 
step, MFH-SPAM adds an additional feature to the last item 
of the current sequence using one of the most general values 
in the feature hierarchy. For example, the possible extensions
to the pattern A → B with an F-step would result in
A → B^{+} or A → B^{−}. During an H-step, MFH-SPAM selects
the last feature of the last item of the current sequence and
specifies its value at one level deeper in the feature hierarchy.
For example, the possible extensions to the pattern A → B^{+}
with an H-step would result in A → B^{+Big} or A → B^{+Small}.

In addition to these two new extension steps in MFH-SPAM,
we define a corresponding extension to the vertical bitmap
approach employed in SPAM to efficiently calculate the support
for a new pattern (footnote 1). For each data sequence, SPAM
initially defines a bitmap for each possible item (e.g., A, B, and
C) that represents the locations of that item in the sequence 
with a value of 1 (all other locations have a value of 0). For 
example, the sequence A → B → B would be represented
with an A bitmap of [1 0 0], a B bitmap of [0 1 1], and a
C bitmap of [0 0 0]. As SPAM generates patterns, it combines
item bitmaps to produce pattern bitmaps in which 1’s
represent the endpoints of the corresponding pattern in the 
sequences. Consequently, for a trivial, single-item pattern 
like A, the pattern bitmap is exactly the same as the initial 
item bitmap. 

For an S-step extension of a pattern (e.g., extending A to
A → B), SPAM first transforms the current pattern bitmap
([1 0 0]) to indicate where the extension to the current pattern
could occur. This is performed by shifting the bitmap
to make each location following the occurrence of a 1 in the
pattern bitmap a 1 (indicating a candidate location for the
additional item being added in the S-step) and making all
other locations 0 (e.g., resulting in the bitmap [0 1 0]). In
other words, A → B exists in the sequence if B exists in the
candidate location of the second position in the sequence.
To complete the S-step (e.g., for A to A → B), SPAM performs
a bitwise AND operation on the transformed pattern
bitmap and the item (B) bitmap, resulting in the new pattern
bitmap of [0 1 0], indicating that the pattern A → B
exists and ends at the second position in the sequence.
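The S-step example above can be traced with a small sketch. Bitmaps are modeled here as plain Python lists for readability, and the function names are assumptions for illustration; an efficient implementation would use packed bit vectors.

```python
def item_bitmap(sequence, item):
    """1 at every position where `item` occurs, 0 elsewhere."""
    return [1 if x == item else 0 for x in sequence]


def s_step_transform(pattern_bitmap):
    """No-gap case: position i is a candidate iff the pattern ended at i-1."""
    return [0] + pattern_bitmap[:-1]


def bit_and(a, b):
    """Element-wise AND of two equal-length bitmaps."""
    return [x & y for x, y in zip(a, b)]


seq = ["A", "B", "B"]
a_bitmap = item_bitmap(seq, "A")          # [1, 0, 0]
b_bitmap = item_bitmap(seq, "B")          # [0, 1, 1]
candidates = s_step_transform(a_bitmap)   # [0, 1, 0]
a_then_b = bit_and(candidates, b_bitmap)  # [0, 1, 0]
print(a_then_b)
```

A sequence supports the new pattern whenever the resulting bitmap contains at least one 1, so support can be counted as the number of sequences for which `any(a_then_b)` holds.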

We extend the SPAM bitmap procedure in F- and H-steps
by first creating bitmaps for each possible feature value (at
every level of the hierarchy) in the sequence, just as SPAM
does with each possible item. Thus, if the original sequence
were A^{−} → B^{+Big} → B^{+Small}, we would have a − bitmap
of [1 0 0], a + bitmap of [0 1 1], a +Big bitmap of [0 1 0],
and a +Small bitmap of [0 0 1]. The bitmap operations for
F- and H-steps are then analogous to those for S-steps except
without the bitmap shift (footnote 2) and using the feature value
bitmap corresponding to the chosen extension. For example,
applying an F-step to add the outcome feature with a value
of + to the pattern A → B, producing A → B^{+}, would
correspond to [0 1 0] (the pattern bitmap) AND [0 1 1] (the
feature value bitmap), giving the new pattern bitmap [0 1
0], indicating that this pattern does occur in the example
sequence and ends at the second position in the sequence.
With the additional F- and H-steps, as well as corresponding
bitmap operations for calculating support, MFH-SPAM
extends SPAM to efficiently search the space of possible patterns
in MFH sequences. Finally, to choose a small subset
of the frequent patterns identified by MFH-SPAM (or by
SPAM for the experimental comparison) that differentiate
the pre-defined learning profiles, we apply a classifier wrapper
method [3]. Using a greedy approach, the classifier wrapper
iteratively identifies the best pattern to include next (footnote 3).

1 In the algorithm description, we describe only the case in
which no gaps are allowed between items in the pattern;
however, implementing more general gap constraints works
in the same manner as with extensions to the original SPAM
algorithm.

2 No shift is necessary because the candidate location is for
adding further detail to the last item in the current pattern
rather than adding an item after it.
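The F- and H-step bitmap operations on the example sequence can be sketched in the same way. The encoding of each position as an item plus a general-to-specific feature path, and the helper names, are assumptions for illustration rather than the actual MFH-SPAM data structures.

```python
# Example sequence A^{-} -> B^{+Big} -> B^{+Small}; each position is
# (item, feature path from most general to most specific value).
seq = [("A", ("-",)), ("B", ("+", "+Big")), ("B", ("+", "+Small"))]


def feature_bitmap(sequence, value):
    """1 where the feature takes `value` at any level of its hierarchy."""
    return [1 if value in path else 0 for _, path in sequence]


def bit_and(a, b):
    """Element-wise AND of two equal-length bitmaps."""
    return [x & y for x, y in zip(a, b)]


a_then_b = [0, 1, 0]  # pattern bitmap for A -> B, taken from the S-step example

# F-step: add the outcome feature at its most general level (no shift needed).
a_then_b_plus = bit_and(a_then_b, feature_bitmap(seq, "+"))

# H-step: specialize the same feature one level deeper.
a_then_b_big = bit_and(a_then_b_plus, feature_bitmap(seq, "+Big"))
a_then_b_small = bit_and(a_then_b_plus, feature_bitmap(seq, "+Small"))

print(a_then_b_plus, a_then_b_big, a_then_b_small)
# [0, 1, 0] [0, 1, 0] [0, 0, 0]
```

The all-zero bitmap for A → B^{+Small} reflects that, with no gaps allowed, the B^{+Small} action in this sequence does not immediately follow A.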

3. DATA AND EVALUATION METHODS 

The data presented in this paper comes from a study of 98 
students from four middle school science classrooms using 
Betty’s Brain for six weeks [7]. Six coherence measures were 
employed to describe the quality and quantity of various 
problem-solving activities for each student, and hierarchi- 
cal clustering with these measures identified three primary 
clusters of students characterized by different behavior pro- 
files [7]. In total, 87 of the students fell into one of these 
three clusters, and the other 11 students exhibited behavior 
profiles indicative of either extreme confusion or disengagement.
The primary clusters were defined as: (1) Frequent
researchers and careful editors, who spent large proportions 
of their time viewing sources of information and did not 
edit their maps very often; (2) Strategic experimenters, who 
spent a fair proportion of their time viewing sources of infor- 
mation, but often did not take advantage of this information; 
and (3) Engaged & efficient students, who edited their maps
very frequently, with edits usually supported by information from
previous activities.

To generate MFH activity sequences for mining, we catego- 
rized learning actions into seven primary categories, defined 
hierarchically (these categories are discussed in more detail
in [2]): Reading resource pages; Searching the resources
for keywords; causal Map Editing; Querying the teachable
agent, Betty; having Betty take a Quiz; asking Betty to Explain
her answer; or taking Notes or causal link annotations
(LinkEval) indicating whether a link is believed to be correct.
To capture the context associated with these actions,
we use additional features: (1) the “Length” dimension (ap- 
plied to Read actions) indicates whether the student spent 
enough time on the page to have read a significant amount 
of the material (Full) or only spent a brief period of time on 
the page (Short) [2]; (2) the “Previous (Full) Read” dimen- 
sion indicates whether the student has previously done an 
in-depth (“Full”) read of the page or not; (3) the “Supported” 
dimension indicates whether or not an EditLink action was 
based on either recently viewed reading materials or quiz 
results [7], with supported actions denoted by Sup and unsupported
actions denoted by NoSup; and (4) the “Map
Score Change” dimension indicates what effect an EditLink
action had on the quality of the student’s map: whether the
quality improved (denoted by +), worsened (denoted by −),
or did not change (denoted by =).

We evaluate our MFH-SPAM approach against four
alternative approaches: Flattened Features (SPAM)
first flattened all activity sequences using all features and
the greatest level of action specificity and then used SPAM
to generate candidate patterns (e.g., this approach would
consider the pattern LinkRem^{+ Sup} → LinkAdd^{− NoSup}, but
it would not consider the more general pattern LinkEdit
→ LinkEdit); Actions-only (SPAM) considered only the frequent
patterns at the most general level of specificity and did
not consider any additional features; MFH-SPAM Baseline
by Frequency used our MFH-SPAM algorithm to generate
candidate patterns and simply selected the 10 most frequent
patterns; and Coherence Metrics classified students using
the coherence measures. The performance of each approach
was evaluated as the average F1 score of the resulting classifier
using 10-fold cross-validation. We chose decision trees as
the classifiers and performed this analysis at mining support
thresholds ranging from 1.0 down to 0.5 in steps of 0.02.

3 A limit of 10 patterns and an increase of at least 0.1% in
performance over the previous pattern set were used in our
implementation of the wrapper. A stratified 5-fold cross-validation
approach was used for building the classifier in
the wrapper with F1 score for evaluation.
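The greedy wrapper selection with the stopping rules given in footnote 3 can be sketched as follows. This is an illustration under stated assumptions, not our implementation: the `score` callable stands in for the cross-validated F1 score of a classifier trained on the selected patterns, the 0.1% increase is interpreted as an absolute gain of 0.001, and the toy gains are invented purely to exercise the loop.

```python
from typing import Callable, List, Sequence


def greedy_wrapper(candidates: Sequence[str],
                   score: Callable[[List[str]], float],
                   max_patterns: int = 10,
                   min_gain: float = 0.001) -> List[str]:
    """Forward selection: repeatedly add the pattern that most improves score."""
    selected: List[str] = []
    best = 0.0
    while len(selected) < max_patterns:
        remaining = [p for p in candidates if p not in selected]
        if not remaining:
            break
        # Evaluate each remaining pattern added to the current set.
        top = max(remaining, key=lambda p: score(selected + [p]))
        top_score = score(selected + [top])
        if top_score - best < min_gain:
            break  # improvement too small; stop adding patterns
        selected.append(top)
        best = top_score
    return selected


# Hypothetical additive scores, used only to demonstrate the stopping rule.
toy_gains = {"p1": 0.50, "p2": 0.20, "p3": 0.0005}


def toy_score(patterns: List[str]) -> float:
    return sum(toy_gains[p] for p in patterns)


print(greedy_wrapper(list(toy_gains), toy_score))
# ['p1', 'p2']
```

In practice, `score` would train and evaluate a decision tree on per-student pattern frequencies; any such function with the same signature can be passed in unchanged.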

4. RESULTS 

[Figure 1 plot not reproduced. Legend: Actions-only (SPAM); MFH-SPAM Baseline by Frequency; Flattened Features (SPAM); MFH-SPAM; Coherence Metrics]

Figure 1: Classification performances of MFH-SPAM and alternative approaches

Figure 1 illustrates the performances of the classifiers built 
using the candidate feature sets mined in each approach. At 
each level of support, MFH-SPAM achieved an average F1
score that was much higher than the scores produced by the 
other sequence mining methods. When using a particularly 
high mining support threshold, the Flattened Features ap- 
proach achieves performance close to that of MFH-SPAM, 
but its performance decreases dramatically as the support 
threshold is reduced (and the search space is increased). One 
striking result from this analysis is that MFH-SPAM’s per- 
formance is on par with the performance of the classifier 
trained with the features used to perform the original clus- 
tering that defined these behavior profile classes. Further, 
Table 1 presents the five patterns chosen most frequently 
across the 10 cross-validation folds at a support threshold 
of 0.9. Considering these top patterns, it is clear that the 
first three patterns could not have been identified without 
MFH-SPAM, as they involve multiple levels of hierarchies 
and feature specificity. 

Table 1: Pattern Frequency Mean (Std Dev) by Cluster for MFH Wrapper with Support 0.9

Pattern                          Researchers   Experimenters   Efficient
LinkRem^{+ Sup} → LinkEdit       2.6 (2.5)     3.6 (3.0)       14.3 (8.4)
LinkEdit^{+ Sup} → LinkAdd       2.3 (1.9)     2.5 (2.6)       12.0 (6.5)
LinkEdit^{+ NoSup} → LinkEdit    3.3 (2.9)     16.4 (16.9)     15.6 (12.3)
LinkEdit^{−} → LinkEdit^{−}      3.7 (3.1)     17.5 (16.0)     18.3 (16.2)
LinkAdd^{−}                      15.3 (7.2)    28.6 (12.1)     43.5 (21.6)

Interestingly, the top MFH-SPAM patterns all involve various
forms of causal link edits. This suggests that the way
a student went about building their map, as opposed to the
way they navigated the resources and investigated Betty’s
quiz results, was the most useful in predicting their overall
learning behavior profile. However, the edit actions, through
the support feature, can also incorporate the action’s relationship
to reading and quiz actions. In other words, what
was most helpful in predicting a student’s cluster was not the
way they acquired information (either from the resources or
quiz results), but how they applied previously acquired information
to editing their maps. When comparing frequency
of use across the three groups, their relative magnitudes are
compatible with the behavior descriptions; e.g., researchers
and careful editors make the fewest of these edits; engaged
& efficient students have the most; and strategic experimenters
fall in between. This confirms that the engaged
& efficient students, who exhibited the best learning behav- 
iors and the largest learning gains [7], are broadly distin- 
guished from the other groups by more map editing overall: 
ineffective and effective; supported and unsupported. The 
usage distributions for these patterns also revealed inter- 
esting characteristics about strategic experimenters. These 
students performed patterns with supported edits far less 
frequently than engaged & efficient students. Conversely, 
they performed patterns with unsupported edits far more 
frequently than researchers and careful editors. Thus, even 
though the engaged & efficient students made several unsup- 
ported and ineffective edits, it would seem that their overall 
edit distribution is far more favorable to achieving better 
map scores (and in their case, better pre-post gains on do- 
main knowledge) than that of the strategic experimenters. 

To better characterize these three groups, we followed up on
previous experimental results [2] and further analyzed the
top behavior pattern, LinkRem^{+ Sup} → LinkEdit, which
indicates an effective map correction behavior (removing an
incorrect link with supporting evidence) followed by further
editing. Overall, an average of 19% (s.d. 9%) of the
engaged/efficient students’ total number of link edits involved
this pattern versus 9% (s.d. 8%) for researchers/careful-editors
and 9% (s.d. 7%) for strategic experimenters. This
behavior of incorporating effective map correction in periods
of extended map editing appears to be a key characteristic
of the engaged/efficient students. Further analysis also
suggested that engaged/efficient students were relatively more
likely to follow this pattern with a quiz to evaluate their re- 
vised map than the researchers/careful-editors and strategic 
experimenters. This may indicate a greater propensity for 
the engaged/efficient students to effectively combine evalu- 
ation of the causal map with map construction and correc- 
tion. In summary, going back to OELE characteristics, the 
engaged and efficient students seem to be better at explor- 
ing the problem-solving space, and in distinguishing correct 
and incorrect approaches to solving complex problems. 

5. DISCUSSION AND CONCLUSIONS 

MFH-SPAM provides a comprehensive approach to mining 
OELE activity sequences by efficiently covering the entire 
MFH action-feature space to generate patterns. Results 
showed that MFH-SPAM consistently outperforms traditional
sequence mining approaches on a behavior profile
classification task. Further, analysis of the MFH-SPAM
patterns illustrated that a compact way to differentiate
these student groups, while retaining high accuracy, is
through their approach to map construction and refinement
using various forms of editing actions. Overall, these results
showed the importance of behavior patterns identified by 
MFH-SPAM and illustrated the potential to use these pat- 
terns to better characterize and ultimately scaffold student 
learning. In general, effective virtual agents for adaptive 
scaffolding in OELEs like Betty’s Brain may do well to fo- 
cus on behavior patterns to gain an understanding of how 
students apply their acquired knowledge (e.g., from reading
the resources and studying quiz results) to build and refine 
models. Detection of specific suboptimal (not using acquired 
information well) or erroneous behaviors in this context may 
provide the needed cue for effective scaffolding. 